Efficient Maintenance and Recovery of Data Warehouses

Authors

  • Dallan Quass
  • Jennifer Widom
Abstract

Data warehouses collect data from multiple remote sources and integrate the information as materialized views in a local database. The materialized views are used to answer queries that analyze the collected data for patterns, anomalies, and trends. This type of query processing is often called on-line analytical processing (OLAP). So that OLAP queries can be posed and answered easily, the data from the remote sources is "cleansed" and translated to a common schema. The warehouse views must be updated when changes are made to the remote information sources. Otherwise, the answers to OLAP queries are based on stale data, which is a problem especially if the answers are used to support critical decisions made by the organization that owns the data warehouse. Because the primary purpose of the data warehouse is to answer OLAP queries, only a limited amount of time and/or resources can be devoted to the warehouse update. Hence, we have developed new techniques to ensure that the warehouse update can be done efficiently. Warehouse updates can also fail. Since only a limited amount of time and/or resources is devoted to the warehouse update, restarting a failed update from scratch is most likely infeasible. Thus, we have developed new techniques for resuming failed warehouse updates. Finally, warehouse updates typically transfer gigabytes of data into the warehouse. Although the price of disk storage is decreasing, there will be a point in the "lifetime" of a data warehouse when keeping and administering all of the collected data is unreasonable. Thus, we have investigated techniques for reducing the storage cost of a data warehouse by selectively "expiring" information that is not needed.

Similar resources

The Lord of the Rings: Efficient Maintenance of Views at Data Warehouses

Data warehouses have become extremely important to support online analytical processing (OLAP) queries in databases. Since the data view that is obtained at a data warehouse is derived from multiple data sources that are continuously updated, keeping a data warehouse up-to-date becomes a crucial problem. An approach referred to as the incremental view maintenance is widely used. Unfortunately, ...

Utility of Ranking Warehouse Candidates in Workshop Locations Using UTAStar

Although facility location in manufacturing and service companies is not a new issue, one of its significant applications is determining the appropriate location for warehouses in manufacturing workshops for the maintenance of materials or products. In any organization, finding a suitable site for warehouse establishments to increase customer service and efficiency is one o...

Online Expansion of Large-scale Data Warehouses

Modern data warehouses store exceedingly large amounts of data, generally considered the crown jewels of an enterprise. The amount of data maintained in such data warehouses increases significantly over time—often at a continuous pace, e.g., by gathering additional data or retaining data for longer periods to derive additional business value, but occasionally also precipitously, e.g., when cons...

Complements for Data Warehouses

Views over databases have recently regained attention in the context of data warehouses, which are seen as materialized views. In this setting, efficient view maintenance is an important issue, for which the notion of self-maintainability has been identified as desirable. In this paper, we extend self-maintainability to (query and update) independence, and we establish an intuitively appealing ...

Issues in Developing Very Large Data Warehouses

The size of The Boeing Company imposes some stringent requirements on data warehouse design and implementation. We summarize four interesting and challenging issues in developing very large scale data warehouses, namely failure recovery, incremental update maintenance, cost model for schema design and query optimization, and metadata definition and management. For each issue, we give the reasons ...

Epsilon Equitable Partition: On Scheduling Data Loading and View Maintenance in Soft Real-time Data Warehouses

Data warehouses contain historic data providing information for analytical processing, decision making and data mining tools. However, several business intelligence applications nowadays require access to real-time data to make sound decisions. As a consequence, there is a great demand to incorporate new data from sources to the data warehouse as fast as possible. That motivates the constructio...

Publication date: 1999